Abstract

We study both numerically and analytically what happens to a random graph of average connectivity
a when its leaves and their neighbors are removed iteratively up to the point when no leaf remains. The 
remnant is made of isolated vertices plus an induced subgraph we call the core. In the thermodynamic limit 
of an infinite random graph, we compute analytically the dynamics of leaf removal, the number of isolated 
vertices and the number of vertices and edges in the core. We show that a second order phase transition 
occurs at a = e = 2.718 . . .: below the transition, the core is small but above the transition, it occupies a 
finite fraction of the initial graph. The finite size scaling properties are then studied numerically in detail in 
the critical region, and we propose a consistent set of critical exponents, which does not coincide with the set 
of standard percolation exponents for this model. We clarify several aspects in combinatorial optimization 
and spectral properties of the adjacency matrix of random graphs. 

Key words: random graphs, leaf removal, core percolation, critical exponents, combinatorial optimization, 
finite size scaling, Monte-Carlo. 



1 Introduction



What remains of a graph when leaves are iteratively removed
until none remains? The answer depends on what is meant by leaves.

In the most standard definition, a leaf is a vertex with
exactly one neighbor, and leaf removal deletes this vertex
and the adjacent edge. In the context of large random
graphs where the connectivity a (the average number of
neighbors of a vertex) is kept fixed and the number of
vertices N → ∞, the answer is well known and interesting.
When a < 1, the remnant after leaf removal is made of
O(N) isolated vertices, plus a subgraph of size o(N) without
leaves. When a > 1, the remnant still contains O(N)
isolated points, but the rest is a subgraph of size O(N),
which is dominated by a single connected component usually
called the backbone. The a priori surprising, but
rather general, fact that backbone percolation and standard
percolation occur at the same point, namely at a = 1,
has a very simple explanation for random graphs. Indeed,
a large random graph of average connectivity a < 1 consists
of a forest (union of finite trees) plus a finite number
of finite connected components with one closed loop.
Obviously, each tree shrinks to a single isolated point after
leaf removal. However, at a = 1 the percolation transition
occurs and when a > 1, a random graph consists of a
forest plus a finite number of components with one closed
loop, plus a "giant" connected component containing a
finite fraction of the vertices and an extensive number of
loops. No loop is destroyed by leaf removal so that the
giant component leads to a macroscopic connected remnant
after leaf removal. The percolation transition was
discovered and studied by Erdős and Rényi in their seminal
paper [?]. This has initiated a lot of work on the random
graph model, and many fine details concerning the structure
of the percolation transition have been computed (see
e.g. Ref. 1).

The random graph model is believed to be essentially 
equivalent to a mean field approximation for percolation 
on (finite dimensional) lattices, leading to critical expo- 
nents which are valid above the upper critical dimension, 
which is dc = 6 for percolation. 

In this paper, we consider the removal of a slightly more
complicated pattern: we remove at each step not only the
leaf but also its neighbor (and consequently all adjacent 
edges). To avoid cumbersome circumlocutions, in the rest 
of this paper, we call leaf the pair "standard leaf + its 
neighbor". Now leaf removal deletes two vertices (a ver- 
tex with a single neighbor and this neighbor) and all the 
edges adjacent to one or both vertices. It is quite natural 
to study the removal of these patterns because it has a 
number of applications to graph theory: several numerical 
characteristics of a graph behave nicely under leaf removal. 
One such characteristic, which was our original motivation
from physics, is the multiplicity of the eigenvalue 0 in the
adjacency matrix of the graph. Others are the minimal
size of a vertex cover and the maximal size of an edge
disjoint subset (the matching problem), questions which are
related to various combinatorial optimization problems.

The matching problem had already led mathematicians
to a thorough study of leaf removal (see Refs. [?] and
references therein). In fact, parts of our analytical results
have already been obtained in this context. However, we
have obtained them independently by a direct enumeration
technique which turned out to be quite similar to a
counting lemma for bicolored trees that appeared in [?].

The main result on the structure of the remnant after
iterated leaf removal when the graph is a large random
graph of finite connectivity a is the following. The
residue consists of i(a)N + O(1) isolated points and an
induced subgraph without leaves or isolated points which
we call the core. It contains c(a)N + O(1) vertices and
l(a)N + O(1) edges. For a < e = 2.718..., c(a) = l(a) = 0,
so the core is small. A second order phase transition
occurs at a = e and for a > e, c(a) and l(a) are > 0.
We shall argue that the core is made of a giant connected
"core" component plus a finite number of small connected
components involving a total of o(N) vertices. The function
i(a) is always non-vanishing, but it is non-analytic at
a = e.

The phase transition at a = e was found initially for the
matching problem [?]. Physicists however have observed
independently that some properties of random graphs are
singular at a = e (see Ref. [?] for replica symmetry breaking
in minimal vertex covers, Ref. [?] for a localization
problem, and, in a numerical context, Ref. [?] where an
anomaly close to the eigenvalue 0 in the spectrum of random
adjacency matrices was observed).

The paper is organized as follows. The general definitions
have been regrouped in section 2. They are standard
and should be used only for reference.

In section 3 we define leaf removal, leaf removal processes
obtained by iteration of leaf removals and the "core"
for an arbitrary graph.

Section 4 presents our derivation of the main results for
large random graphs. The analytical formulae for i(a),
c(a) and l(a) are given.

In section 5 these formulae are checked against Monte-Carlo
simulations of leaf removal processes, which we also
use for the finite size scaling analysis in the critical region.
This leads us to the definition and numerical evaluation of
many new critical exponents. In particular, we give good
evidence that the core percolation exponents (at a = e)
are not the same as the critical exponents of standard
percolation (at a = 1). Even if core percolation on a
random graph can presumably be seen as a mean field
approximation for core percolation on (finite dimensional)
lattices with impurities, the corresponding effective field
theory and its upper critical dimension are not known to
us.

In section 6, we give two applications of our results. We
show in particular in section 6.1 that for any a the core of a
random graph only carries a small number of eigenvalues
of the adjacency matrix and that the emergence of the
core has a direct impact on the localized and delocalized
eigenvectors with eigenvalue 0. In section 6.2, we show
that for a < e, the problem of finding minimal vertex
covers or maximal edge disjoint subsets (matchings) can be
handled very simply in polynomial time (in fact, in linear
time once the graph is encoded in a suitable form). While
the matching problem can always be solved in polynomial
time, the minimal vertex cover problem is believed to be
NP-hard for general graphs, and the same should be true
on the core of a random graph for a > e.

The formal proof that the core is a well-defined object 
is given in appendix. 

2 General definitions 

We start with a few standard definitions. This section 
should be used only for reference. 

Graph: A graph (also called a simple undirected graph 
in the mathematical literature) G is a pair consisting of 
a set V called the set of vertices of G and a set E called 
the set of edges of G, whose elements are pairs of distinct 
elements of V. If {v,w} is an edge, the vertices v and w 
are called adjacent or neighbors. They are the extremities 
of the edge {v,w}. Note that there is at most one edge 
between two vertices, and that there is no edge connecting 
a vertex with itself: the word simple above refers to these 
two restrictions. 

Adjacency matrix: The adjacency matrix of a graph
G is a square matrix M_{v,w} indexed by vertices of G and
such that M_{v,w} = 1 if {v, w} is an edge of G and 0
otherwise. Note that M is a symmetric 0-1 matrix with
zeroes on the diagonal. Conversely, any such matrix is the
adjacency matrix of a graph. We denote by Z(G) the
dimension of the kernel (that is, the subspace of eigenvectors
with eigenvalue 0) of the adjacency matrix of G.

Induced subgraph: If V' ⊂ V, the graph with vertex
set V' and edge set E' those edges in E with both
extremities in V' is called the subgraph of G induced by V'.

Random graph in the microcanonical ensemble:
If V = {1, ..., N}, there are $\binom{N(N-1)/2}{L}$ graphs with
vertex set V and L edges (making a total of $2^{N(N-1)/2}$
graphs with vertex set V). Saying that all $\binom{N(N-1)/2}{L}$ are
equiprobable turns the set of graphs on N vertices and L
edges into a probability space whose elements we call random
graphs in the microcanonical ensemble. This is the
ensemble we use below for numerical simulations.

Random graph in the canonical ensemble: Given
a number p ∈ [0,1], we introduce N(N-1)/2 independent
random variables S_ij, 1 ≤ i < j ≤ N, each taking value
1 with probability p and 0 with probability 1 - p. Saying
that {i,j} is an edge of G if and only if S_ij = 1 turns
the set of all $2^{N(N-1)/2}$ graphs with vertex set V into a
probability space whose elements we call random graphs
in the canonical ensemble. This is the ensemble we use
below for analytical computations.

Connectivity, a: In the sequel, we are interested in the
large N limit with a finite limit a for 2L/N (microcanonical
ensemble) or for p(N - 1) (canonical ensemble). The
parameter a is the average connectivity, the average number
of neighbors of a given vertex in the random graph.

If N (p(1-p))^{1/2} → ∞ and L = p N(N-1)/2, the
thermodynamic properties of a G(N, L) in the microcanonical
ensemble and of a G(N, p) in the canonical ensemble are
the same. This is in particular true if pN = a is kept fixed
as N → ∞.

3 Leaf removal process and the 
core of a graph 

Our aim is to define, for any (random or not) finite graph
G, a remarkable subgraph which we call the core of G. It
is obtained by leaf removal, an operation that we define
now.

Leaf: A leaf of a graph G is a couple of vertices (v, w)
such that {v, w} is an edge of G and w belongs to no
other edge of G. Note that this is not the most standard
definition and that (v, w) and (w, v) are both leaves if and
only if {v, w} is a connected component of G.

Bunch of leaves: A bunch of leaves is a maximal fam- 
ily of leaves with the same first vertex. The leaves of a 
graph can be grouped into bunches of leaves in a unique
way.

Leaf removal: If (v, w) is a leaf of G, and G' the
subgraph of G induced by V \ {v, w}, we say that G' is
obtained from G by leaf removal of (v, w). In other words,
G' is obtained from G by removing vertices v and w, the
edge {v, w} and all other edges touching v. Note that this
operation can destroy other leaves of G and also create
new leaves. See Fig. 1 for a pictorial example.

Step by step leaf removal process: Start from a
graph G. If G has no leaves, stop. Else, choose a leaf
(v, w) and remove it, leading to a graph G'. If G' has no
leaves, stop. Else, choose a leaf (v', w') and remove it.
This operation is iterated until no leaf remains.

Figure 1: In this example, the leaf (v,w) is removed, as
well as the four edges touching v: the new graph G' is the
subgraph of G induced by the five remaining vertices. Note
that the vertex t is now isolated, and that a new leaf (z, y)
has been created.

History: A sequence G, (v, w), G', (v', w'), ... associated
to a step by step leaf removal process is called a
history.

Isolated points, I; Core of a graph, C: The last
term in a history starting from G is a graph which splits
into a collection of isolated points I, and an induced
subgraph C of G without leaves or isolated points which we
call the core of G. We denote the number of points in the
core by Nc and the number of edges in the core by Lc.

For these definitions to make sense, one has to show 
that the number of isolated points and the core are well 
defined, that is, do not depend on the choice of history. 
The formal proof is given in the appendix. 

Global leaf removal process: Start from a graph G.
If it has no leaves, stop. Else select one leaf in every bunch
of leaves. Remove from the vertex set V both extremities
of all the selected leaves, and define V^(1) to be the set
of remaining vertices. Let G^(1) be the subgraph of G
induced by V^(1). Note that the leaves of G^(1) (if any) are
not leaves of G. In this operation, the vertices belonging
to a bunch of G that were not selected become isolated
points of G^(1). They remain isolated for the rest of the
process. Iterate the procedure and define G^(2), G^(3), ...
until a graph without leaves is obtained.
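
The global process described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `global_leaf_removal` and the dictionary-of-sets graph representation are choices made for this sketch (they are not taken from the paper), and each pass scans the whole graph, so no attempt is made at efficiency.

```python
def global_leaf_removal(adjacency):
    """Apply the global leaf removal process to a graph given as
    {vertex: set_of_neighbors}.

    Returns (isolated_vertices, core_vertices, number_of_steps).
    The representation and the name are assumptions of this sketch.
    """
    adj = {v: set(nb) for v, nb in adjacency.items()}  # work on a copy
    steps = 0
    while True:
        # A leaf is a couple (v, w) where v is the only neighbor of w.
        # Keep exactly one leaf per bunch, i.e. per first vertex v.
        selected = {}
        for w, nb in adj.items():
            if len(nb) == 1:
                (v,) = nb
                selected.setdefault(v, w)
        if not selected:
            break
        # Remove both extremities of every selected leaf at once.
        removed = set(selected) | set(selected.values())
        for x in removed:
            del adj[x]
        for nb in adj.values():
            nb -= removed  # in-place: drops edges to removed vertices
        steps += 1
    isolated = {v for v, nb in adj.items() if not nb}
    core = set(adj) - isolated
    return isolated, core, steps
```

On a path with five vertices, for instance, one pass removes the two end leaves together with their neighbors and leaves the middle vertex isolated.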

In the proof given in the appendix that the core is well- 
defined, the argument in step 3c implies in particular that 
leaf removals that take place in distinct bunches of a graph 
commute. It implies that the global leaf removal process
leads to the same core and number of isolated points as any
step by step leaf removal process.

While the global leaf removal process is convenient for 
analytical computations, the step by step leaf removal pro- 
cess is easier to implement on the computer. 
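
A minimal sketch of the step by step process, assuming the graph is given as a vertex count and a list of edges. The function name `leaf_removal` and the stack of candidate leaves are choices made for this example; the paper's own program is not reproduced here.

```python
from collections import defaultdict


def leaf_removal(num_vertices, edges):
    """Step by step leaf removal on vertices 0..num_vertices-1.

    Returns (isolated_vertices, core_vertices, number_of_core_edges).
    Hypothetical helper written for illustration only.
    """
    adj = defaultdict(set)
    for v, w in edges:
        adj[v].add(w)
        adj[w].add(v)
    alive = set(range(num_vertices))
    # Candidates: vertices that currently have exactly one neighbor.
    stack = [w for w in alive if len(adj[w]) == 1]
    while stack:
        w = stack.pop()
        if w not in alive or len(adj[w]) != 1:
            continue  # stale candidate: already removed or degree changed
        (v,) = adj[w]  # (v, w) is a leaf: v is the only neighbor of w
        alive.discard(v)
        alive.discard(w)
        for u in adj[v]:
            if u != w:
                adj[u].discard(v)  # delete every other edge touching v
                if len(adj[u]) == 1:
                    stack.append(u)  # a new leaf may have been created
        adj[v].clear()
        adj[w].clear()
    isolated = {x for x in alive if not adj[x]}
    core = alive - isolated
    core_edges = sum(len(adj[x]) for x in core) // 2
    return isolated, core, core_edges
```

Each leaf removal only touches the neighbors of the removed vertex v, which is what keeps the total work proportional to the number of edges.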
4 Core percolation: infinite N results

The global leaf removal process allows one to compute the
most salient characteristics of the leaf removal process,
the functions i(a), c(a) and l(a). Remember that by
definition, the number of isolated points after leaf removal
is N i(a) + o(N), the number of vertices in the core is
Nc = N c(a) + o(N), and the number of edges in the core
is Lc = N l(a) + o(N). The clue is an enumeration of all
the configurations on the random graph that contribute
extensively to the fundamental events in the global leaf
removal process at step n (which goes from G^(n-1) to
G^(n)): emergence of a new isolated point, removal of a
point, removal of an edge¹. This enumeration is possible
because the finite configurations of vertices and edges in
the random graph with extensive multiplicity are tree-like.
This implies that the problem has a recursive structure.
The weight of a tree is chosen so as to reproduce the
correct random graph weight: a vertex has weight e^{-a} and
an edge has weight a. One has to be rather careful to
avoid double counting and omissions; the examination of
all cases is very tedious so we omit the details and simply
outline the strategy.

The key ingredient is a study of leaf removal on rooted
trees. Starting from a rooted tree, we apply the global leaf
removal process, with the convention that even if the root
has only one neighbor, it is not counted as a standard
leaf². We let p_n, n ≥ 0, be the generating function for
rooted trees whose root becomes isolated exactly at step
n of the global leaf removal process, and q_n, n ≥ 1, be the
generating function for rooted trees whose root is removed
exactly at step n of the global leaf removal process. For
instance, p_0 counts rooted trees with an isolated root, hence
trees with a single vertex, and p_0 = e^{-a}. As another
example, q_1 counts rooted trees whose root touches at least
one standard leaf. Consider the trees whose root touches
exactly k ≥ 1 standard leaves, and l ≥ 0 other vertices.
These l vertices can be seen as roots of non-trivial
subtrees of the original tree, so by definition they contribute
to 1 - p_0. So the weight is

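$$ e^{-a}\,\frac{(a e^{-a})^k}{k!}\,\frac{(a(1-p_0))^l}{l!}. $$

Hence

$$ q_1 = e^{-a} \sum_{k \ge 1} \sum_{l \ge 0} \frac{(a e^{-a})^k}{k!}\,\frac{(a(1-p_0))^l}{l!} = 1 - e^{-a e^{-a}}. $$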
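
For readers who want the intermediate step, the double sum factorizes into two exponential series. The worked evaluation below is a sketch that takes the weight of a tree with k ≥ 1 standard leaves and l ≥ 0 other neighbors of the root to be e^{-a} (a e^{-a})^k / k! (a(1-p_0))^l / l!, with p_0 = e^{-a}:

```latex
\begin{aligned}
q_1 &= e^{-a}\Big(\sum_{k\ge 1}\frac{(a e^{-a})^k}{k!}\Big)
             \Big(\sum_{l\ge 0}\frac{\big(a(1-e^{-a})\big)^l}{l!}\Big) \\
    &= e^{-a}\,\big(e^{a e^{-a}} - 1\big)\, e^{\,a - a e^{-a}} \\
    &= \big(e^{a e^{-a}} - 1\big)\, e^{-a e^{-a}}
     \;=\; 1 - e^{-a e^{-a}} .
\end{aligned}
```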



In the same way, contributions to p_n or q_n for larger n's
can be analyzed in terms of the trees attached to the
neighbors of the root, and the structure of these trees involves
lower order contributions. In contributions to p_n, the root
has at least one neighbor whose attached tree contributes
to q_n and any number of neighbors contributing to q_1 or
q_2 or ... or q_{n-1}. So

$$ p_n = e^{-a}\left(e^{a q_n} - 1\right) e^{a(q_{n-1} + \cdots + q_1)}. $$

¹ The method of Refs. [?] relies on approximate differential
equations that apply to a slightly different model of random graphs.
It is very powerful, but less intuitive than the direct enumeration
method that follows.

² However, a configuration where a neighbor of the root is a
standard leaf is treated as usual.



Analogously, in contributions to q_n, the root has at least
one neighbor whose attached tree contributes to p_{n-1} and
any number of neighbors, none of which contributes to p_0
or p_1 or ... or p_{n-1}. So

$$ q_n = e^{-a}\left(e^{a p_{n-1}} - 1\right) e^{a(1 - p_{n-1} - \cdots - p_0)}. $$



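These two relations allow one to show that

$$ p_n = e_{2n+1} - e_{2n-1} \quad \text{for } n \ge 0, \qquad q_n = e_{2n-2} - e_{2n} \quad \text{for } n \ge 1, $$

where e_n(a) is the sequence of iterated exponentials, defined by

$$ e_{-1} = 0 \quad \text{and} \quad e_n = e^{-a\, e_{n-1}} \quad \text{for } n \ge 0. \tag{1} $$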
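
The closed forms can be checked numerically against the recursions. The sketch below assumes the two relations read p_n = e^{-a}(e^{a q_n} - 1) e^{a(q_1 + ... + q_{n-1})} and q_n = e^{-a}(e^{a p_{n-1}} - 1) e^{a(1 - p_0 - ... - p_{n-1})}, with e_{-1} = 0 and e_n = exp(-a e_{n-1}); the function names are hypothetical and introduced only for this example.

```python
import math


def iterated_exponentials(a, n_max):
    """Return a function e(n), valid for -1 <= n <= n_max, with
    e(-1) = 0 and e(n) = exp(-a * e(n - 1))."""
    values = [0.0]  # values[0] holds e_{-1}
    for _ in range(n_max + 1):
        values.append(math.exp(-a * values[-1]))
    return lambda n: values[n + 1]  # shift by one because of e_{-1}


def p_q_by_recursion(a, n_max):
    """Return lists p[0..n_max] and q[0..n_max] built from the two
    recursion relations (q[0] is a placeholder: q_0 is not defined)."""
    p = [math.exp(-a)]  # p_0: a single isolated root
    q = [0.0]
    for n in range(1, n_max + 1):
        # q_n needs p_0 .. p_{n-1}
        q_n = (math.exp(-a) * (math.exp(a * p[n - 1]) - 1.0)
               * math.exp(a * (1.0 - sum(p[:n]))))
        q.append(q_n)
        # p_n needs q_n and q_1 .. q_{n-1}
        p_n = (math.exp(-a) * (math.exp(a * q_n) - 1.0)
               * math.exp(a * sum(q[1:n])))
        p.append(p_n)
    return p, q
```

Comparing `p[n]` with `e(2*n + 1) - e(2*n - 1)` and `q[n]` with `e(2*n - 2) - e(2*n)` for a few values of a is a quick sanity check of the bookkeeping.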



The events on the random graph that at step n of the 
global leaf removal process a given vertex becomes iso- 
lated, or that a given vertex disappears, or that a given 
edge disappears can all be interpreted in terms of the pre- 
vious configurations. In each case, the different possible 
contributions have to be taken into account, and also the 
rule that a root with a single neighbor can be touched by 
leaf removal has to be enforced. We omit this painful case 
by case analysis and only state the results. 

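The explicit formulas for the extensive contribution to the
average number N i_n(a) of isolated vertices, N c_n(a) of
non-isolated vertices and N l_n(a) of edges in G^(n) are

$$ \begin{aligned}
i_n(a) &= e_{2n+1} + e_{2n} + a\, e_{2n} e_{2n-1} - 1, \\
c_n(a) &= e_{2n} - e_{2n+1} - a\, e_{2n} e_{2n-1} + a\, e_{2n-1}^2, \\
l_n(a) &= \frac{a}{2}\left(e_{2n} - e_{2n-1}\right)^2.
\end{aligned} \tag{2} $$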



Now comes the crucial fact: when a < e, the sequence
e_n(a) converges to W(a)/a, where W(a) is the Lambert
function, defined for a ≥ 0 as the unique real solution of
the equation W e^W = a. The function W(a) is analytic
for a > 0. When a > e, W(a)/a remains a fixed point of
the iteration Eq. (1) but it is unstable: the sequence {e_n}
is oscillating. However the even subsequence e_{2n} and odd
subsequence e_{2n+1} are still convergent. The even limit is
strictly larger than the odd limit. We define the functions
A(a) and B(a) for a ≥ 0 by

$$ \lim_{n\to\infty} e_{2n} = \frac{B}{a} \quad \text{and} \quad \lim_{n\to\infty} e_{2n+1} = \frac{A}{a}. $$

Then (A, B) solve the system

$$ A e^{B} = a, \qquad B e^{A} = a. \tag{3} $$



M. Bauer and O. Golinelli — Core percolation in random graphs: a critical phenomena analysis 



Figure 2: The special functions W(a), A(a) and B(a),
which coincide for a < e, plotted against the connectivity a.

For a < e, the unique solution is A = B = W. For a > e,
the previous solution becomes unstable and (A, B) is the
solution of Eq. (3) selected by the rule A < W < B. This
is summarized on Fig. 2.

Taking the limit in Eq. (2) leads to

$$ \begin{aligned}
i(a) &= \frac{A + B + AB}{a} - 1, \\
c(a) &= \frac{(B - A)(1 - A)}{a}, \\
l(a) &= \frac{(B - A)^2}{2a}.
\end{aligned} \tag{4} $$



For a < e, c(a) = l(a) = 0, and the core indeed has a
size o(N). On the other hand, the core occupies a finite
fraction c(a) of the vertices for a > e. The behavior of
Eq. (?) is responsible for this geometric transition, core
percolation, at a = e. As c(a) and l(a) → 0 when a → e^+,
these functions are continuous but their derivatives are
not: the transition is of second order. Note again that
core percolation appears at a = e, contrary to backbone
percolation, which occurs at a = 1.
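
The densities can be evaluated numerically by iterating the exponential map. The sketch below assumes Eq. (4) in the form i = (A + B + AB)/a - 1, c = (B - A)(1 - A)/a, l = (B - A)^2/(2a), with B/a and A/a the limits of the even and odd subsequences; the function name `core_densities` and the fixed iteration count are choices of this example. Convergence becomes very slow close to a = e, so the sketch is meant for values of a away from the transition.

```python
import math


def core_densities(a, n_iter=5000):
    """Return (A, B, i, c, l) for connectivity a, obtained by iterating
    e_n = exp(-a * e_{n-1}) two steps at a time, starting from e_0 = 1."""
    even = 1.0  # e_0
    for _ in range(n_iter):
        odd = math.exp(-a * even)   # e_{2n+1}
        even = math.exp(-a * odd)   # e_{2n+2}
    B = a * even                    # even limit is B / a
    A = a * math.exp(-a * even)     # odd limit is A / a
    i = (A + B + A * B) / a - 1.0   # density of isolated points
    c = (B - A) * (1.0 - A) / a     # density of core vertices
    l = (B - A) ** 2 / (2.0 * a)    # density of core edges
    return A, B, i, c, l
```

Below the transition the two limits coincide and both c and l vanish up to rounding; above it A < B and the core has a finite density.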

The fact that the core of a graph is an induced subgraph
of the original graph allows one to give a physicist's argument
for the uniqueness of the giant component in the core. Fix
a > e. If the core of the random graph contains two or 
more large connected components, there was no edge with 
extremities in two components in the original graph. But 
as the total size of the large connected components is of 
order N, the absence of such an edge is extremely unlikely. 

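The behavior of thermodynamic functions close to the
transition is the following. Writing a = e(1 + ε), for small
negative ε,

$$ A(a) = B(a) = 1 + \frac{1}{2}\epsilon - \frac{3}{16}\epsilon^2 + \frac{19}{192}\epsilon^3 - \frac{185}{3072}\epsilon^4 + \frac{2437}{61440}\epsilon^5 + O(\epsilon^6), $$

while for small positive ε,

$$ A(a) = 1 - 6^{1/2}\epsilon^{1/2} + 2\epsilon - \frac{6^{1/2}}{20}\epsilon^{3/2} - \frac{3}{5}\epsilon^2 + O(\epsilon^{5/2}), $$

$$ B(a) = 1 + 6^{1/2}\epsilon^{1/2} + 2\epsilon + \frac{6^{1/2}}{20}\epsilon^{3/2} - \frac{3}{5}\epsilon^2 + O(\epsilon^{5/2}). $$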

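For i(a), this implies that there is a jump only in the
second derivative at the transition, with

$$ i(a) = \frac{3-e}{e} - \frac{1}{e}\epsilon +
\begin{cases}
\dfrac{1}{2e}\epsilon^2 + O(\epsilon^3) & \text{for } \epsilon < 0, \\[2ex]
\dfrac{2}{e}\epsilon^2 + O(\epsilon^3) & \text{for } \epsilon > 0,
\end{cases} $$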



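while c(a) and l(a) have a jump in the first derivative at
the transition, with

$$ c(a) =
\begin{cases}
0 & \text{for } \epsilon < 0, \\[1ex]
\dfrac{12}{e}\epsilon - \dfrac{4 \cdot 6^{1/2}}{e}\epsilon^{3/2} - \dfrac{54}{5e}\epsilon^2 + O(\epsilon^{5/2}) & \text{for } \epsilon > 0,
\end{cases} $$

and

$$ l(a) =
\begin{cases}
0 & \text{for } \epsilon < 0, \\[1ex]
\dfrac{12}{e}\epsilon - \dfrac{54}{5e}\epsilon^2 + O(\epsilon^3) & \text{for } \epsilon > 0.
\end{cases} $$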



The expansion for l contains only integral powers of ε,
and this may be related to the fact (see section 5) that
the finite size corrections for the number of edges in the
core Lc are slightly nicer than the ones for the number of
vertices in the core Nc. The average connectivity of the
core is

$$ a_{\mathrm{eff}} = \frac{2\,l(a)}{c(a)} = \frac{B - A}{1 - A} = 2 + \frac{2 \cdot 6^{1/2}}{3}\epsilon^{1/2} + \frac{4}{3}\epsilon + O(\epsilon^{3/2}) $$

for ε > 0, which implies that the giant core component should
look like a large loop for a close to e^+. The exponent 1/2
in the first correction makes such a property quite difficult
to see numerically at large but finite N.

In the random graph model, the vertices do not live in
any ambient space, and the notion of correlation length is
ambiguous. This problem will reappear in the finite size
scaling analysis of the next section. However, the emergence
of the core is very reminiscent of critical phenomena
in physics. In particular, the critical slowing down is
observable during the global leaf removal process. Indeed,
the speed of convergence of the iterated exponential
sequence can be computed. One finds that for a ≠ e, the
convergence is exponential: the convergence rate ξ(a)
is given by the formula

$$ \frac{1}{\xi} = \frac{A + B}{2} - \log a, $$

and more precisely

$$ \begin{aligned}
e_n - \frac{W}{a} &\sim (-)^n\, w(a)\, e^{-n/\xi} && \text{for } a < e, \\
e_{2n+1} - \frac{A}{a} &\sim -a(a)\, e^{-(2n+1)/\xi} && \text{for } a > e, \\
e_{2n} - \frac{B}{a} &\sim b(a)\, e^{-2n/\xi} && \text{for } a > e,
\end{aligned} $$

where a(a), w(a) and b(a) are positive functions of the
connectivity (they coincide for a < e, and
b(a) e^{-B/2} = a(a) e^{-A/2}). When a → e^-,
ξ ~ -2/ε and when a → e^+, ξ ~ 1/ε.
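
The rate formula can be recovered by linearizing the iteration e_n = exp(-a e_{n-1}) around its limits. The short derivation below is a sketch added for orientation; it assumes Eq. (3) in the form A e^B = a, B e^A = a and the rate formula in the form 1/ξ = (A + B)/2 - log a. The symbol δ_n, the small deviation of e_n from its limit, is introduced here for convenience and is not the paper's notation.

```latex
% a < e: a single fixed point e^* = W/a, with W e^W = a
\delta_{n+1} \simeq -a\,e^{*}\,\delta_n = -W\,\delta_n
\;\Longrightarrow\;
|\delta_n| \propto W^{n} = e^{-n/\xi},
\qquad
\frac{1}{\xi} = -\log W = W - \log a .

% a > e: two-step map between the limits A/a (odd) and B/a (even)
\delta_{2n+2} \simeq (-A)(-B)\,\delta_{2n} = AB\,\delta_{2n}
\;\Longrightarrow\;
e^{-2/\xi} = AB,
\qquad
\frac{1}{\xi} = -\tfrac{1}{2}\log(AB) = \frac{A+B}{2} - \log a ,
% using \log A = \log a - B and \log B = \log a - A from Eq. (3).
```

With A = B = W the second expression reduces to the first, so a single formula covers both sides of the transition.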

At a = e, the convergence is algebraic, and 



e e 



6^/2 3 



„+i 21(6)1^ log, 



0(- 



1 



„l/2 ' 2n ^ ^ 80 7l3/2 ^„3/2' 

This leads to the asymptotics at a = e: 



3/2 



21 log n ^/l 
+ ( - 

n 



en V 4ni/2 80 n 



en V 80 n V n 



The first correction for c_n is more important than the one
for l_n. Moreover the logarithms at a = e lead one to suspect
that the finite size analysis of the next section might also
be complicated by logarithms.

5 Numerical studies of the core 
percolation 

The analytical computations above have enabled us to lo- 
cate a phase transition at a = e. They give information 
concerning the critical region but do not exhaust all the 
critical exponents. So we made an extensive numerical 
analysis of the core using Monte-Carlo simulations. At 
the first step this can also be used to check the previous 
analytical results. But let us start with the numerical al- 
gorithm. 

5.1 Monte-Carlo algorithm 

Our Monte-Carlo simulations consist in generating lots of 
random graphs, removing leaves step by step, and study- 
ing the remaining cores and isolated points. More pre- 
cisely, for a given set of parameters (N,a), we generated 
random graphs in the microcanonical ensemble, with N 
vertices and L = Na/2 edges (L is rounded to the nearest
integer value). In the microcanonical ensemble the total 
number L of edges is fixed (in contrast to the canonical 
ensemble in which L fluctuates). 



As we want to simulate graphs with up to 10^6 vertices and
with an average connectivity a of order O(1), we must
use an algorithm which requires computer memory and
time of order O(N), not O(N^2). With the microcanonical
ensemble, the program is simpler: a random graph is
obtained by choosing at random L distinct edges among
all the possible edges. From a Monte-Carlo point of view,
the microcanonical ensemble has another advantage: the
measurements fluctuate less.
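
Choosing L distinct edges can be sketched with rejection sampling, which takes O(L) expected time when the graph is sparse. This is an assumption about one reasonable way to do it, not a description of the authors' program; the function name and the `rng` parameter are hypothetical. Note that Python's `round` sends exact halves to the nearest even integer, which is one possible reading of "rounded to the nearest integer".

```python
import random


def random_graph_microcanonical(n, a, rng=random):
    """Return a sorted list of L = round(n * a / 2) distinct edges (v, w),
    v < w, drawn uniformly among all sets of L edges on vertices 0..n-1."""
    num_edges = round(n * a / 2)
    if num_edges > n * (n - 1) // 2:
        raise ValueError("too many edges for a simple graph")
    edges = set()
    while len(edges) < num_edges:
        v = rng.randrange(n)
        w = rng.randrange(n)
        if v != w:  # no edge from a vertex to itself
            edges.add((min(v, w), max(v, w)))  # duplicates are ignored
    return sorted(edges)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.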

As the graph (or equivalently its adjacency matrix) is
very sparse, it is stored in an array T of 2L integers,
indexed by an array K of N + 1 integers; the set of vertices
adjacent to the vertex v is the array section {T(i)} where
K(v) ≤ i < K(v + 1). Then the connectivity of v is
K(v + 1) - K(v). This defines the array K, with the rules
K(1) = 1 and K(N + 1) = 2L + 1.

Note that each edge {v,w} appears twice in T: once
for v and once for w. This requires twice as much memory
as a storage method exploiting the symmetry of the
matrix (in which the edges appear only once), but the
computational task is faster because the adjacent vertices
of a given vertex are simply obtained from arrays T and
K.
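
The two arrays can be built in two passes over the edge list. The sketch below uses 0-based indexing, so the rules become K[0] = 0 and K[N] = 2L instead of K(1) = 1 and K(N + 1) = 2L + 1; the function names are assumptions of this example.

```python
def build_adjacency_arrays(n, edges):
    """Build the arrays T (2L neighbor entries) and K (n + 1 offsets) so
    that the neighbors of vertex v are T[K[v]:K[v + 1]] (0-based)."""
    degree = [0] * n
    for v, w in edges:
        degree[v] += 1
        degree[w] += 1
    K = [0] * (n + 1)
    for v in range(n):
        K[v + 1] = K[v] + degree[v]  # running sum of the connectivities
    T = [0] * K[n]
    next_slot = K[:n]  # copy: next free position in each vertex's section
    for v, w in edges:
        T[next_slot[v]] = w
        next_slot[v] += 1
        T[next_slot[w]] = v
        next_slot[w] += 1
    return T, K


def neighbors(T, K, v):
    """Return the list of vertices adjacent to v."""
    return T[K[v]:K[v + 1]]
```

Storing every edge twice is what makes `neighbors` a plain slice, at the price of the doubled memory mentioned above.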

The leaf removal process is done leaf by leaf. Each time
a leaf is removed, the adjacent vertices are examined: if
new leaves appear, they are added to a list of potential
leaves to be considered later. Each elementary leaf removal
requires a computational time of order O(1) and
not O(N). So the computational time for the global leaf
removal is proportional to the number of removed leaves,
which is bounded by N/2. Then the total computational
time for one random graph (generation and leaf removal)
is N times a function of a.

For each random graph, we have measured the number
of isolated points |I|, the size (number of vertices)
of the core Nc, the number of edges of the core Lc and
consequently the average connectivity of the core 2Lc/Nc.
These are estimators of N i(a), N c(a), N l(a) and a_eff,
as defined before. As usual in Monte-Carlo simulations,
averages have been done over many random graphs, and
their confidence intervals (or error bars) are estimated by
the variance of the measurements.
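
A common way to turn the variance of independent measurements into an error bar is the standard error of the mean. The sketch below assumes that this is what is meant; the paper does not spell out the exact estimator, and the function name is hypothetical.

```python
import math


def mean_and_error(samples):
    """Return (mean, standard error of the mean) for a list of
    independent measurements, using the unbiased sample variance."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(variance / n)
```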

For each value of a, we have generated 10 000 graphs of
size N = 100, N = 1000 and N = 10 000, and 1000 graphs
of size N = 100 000 and N = 1 000 000. At the transition
value a = e, 10 000 graphs have been generated for all
the sizes. The whole computation takes a few days on a
medium Sun workstation without special optimization.

5.2 Monte-Carlo results 

On Fig. 3, Monte-Carlo averages of Nc/N, Lc/N and
2Lc/Nc are compared with the infinite N results, c(a),
l(a) and a_eff. Error bars are not drawn because they
are smaller than the size of symbols. This figure is a typical
case of a second order transition. Far from the transition,
differences between finite N and thermodynamic
(i.e. N = ∞) functions are small. Finite size effects are at
least of order 1/N because the analytical calculations do
not take into account the loops of finite size: their number
is O(1), so their contributions are O(1/N). The simplest
example is the "triangle" subgraph made of three vertices
and three edges, not connected to the rest of the graph.
Obviously, the triangles have no leaf and belong to the
core: their contribution to Nc is (a e^{-a})^3 / 2.

Figure 3: Monte-Carlo averages (symbols) and analytical
results (solid line) for the size, edges and average
connectivity of the core.

Figure 4: Monte-Carlo estimations of the variance of the
size of the core χ_c(a) and the variance of the number of
edges of the core χ_l(a).

We have verified that finite size effects are of order
O(1/N) (but not larger) for c(a) and l(a) far from the
transition. For a_eff, this is probably true but less clear
because fluctuations are stronger. On the other hand, in
the critical region a ≈ e, finite size effects are larger and
some critical exponents can be defined. They are discussed
later.

Figure 5: Monte-Carlo estimations of χ_a(a) (variance of
the average connectivity of the core) versus (a - e). The
axes are labeled by decimal logarithms.

We have also examined the variances of the size, number
of edges and average connectivity of the core. For a not
too close to e and large N, we expect that the fluctuations
(square root of the variance) are of order O(N^{1/2}) for Nc
and Lc, and O(1/N^{1/2}) for a_eff. So to obtain a large N
limit, we define χ_c(a) = Var(Nc)/N, χ_l(a) = Var(Lc)/N
and χ_a(a) = N Var(2Lc/Nc). These quantities are
analogous in the spin models to the magnetic susceptibility
(equivalent to the fluctuations of the magnetization).
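
These three estimators can be sketched as follows, assuming the definitions χ_c = Var(Nc)/N, χ_l = Var(Lc)/N and χ_a = N Var(2Lc/Nc). How graphs with an empty core (Nc = 0) enter the connectivity variance is not stated in the text; this sketch simply skips them, which is an assumption, as are the function and argument names.

```python
from statistics import variance


def susceptibilities(n, core_sizes, core_edges):
    """Return (chi_c, chi_l, chi_a) from per-graph measurements of the
    core size Nc and the number of core edges Lc, for graphs of size n."""
    chi_c = variance(core_sizes) / n
    chi_l = variance(core_edges) / n
    # Average connectivity 2 Lc / Nc, defined only for non-empty cores.
    connectivity = [2 * l / c for c, l in zip(core_sizes, core_edges) if c > 0]
    if len(connectivity) >= 2:
        chi_a = n * variance(connectivity)
    else:
        chi_a = float("nan")  # not enough non-empty cores to estimate
    return chi_c, chi_l, chi_a
```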

On Fig. 4, Monte-Carlo estimations of χ_c and χ_l show
that χ_c(a) and χ_l(a) have a finite limit for N = ∞ when
a > e, a vanishing limit when a < e and diverge when a
approaches the critical value e. By analogy with c(a) ~
l(a) ~ (a - e) and a_eff - 2 ~ (a - e)^{1/2} when a → e^+,
power laws are expected for the divergences. So, we define
two critical exponents ρ and ρ' by

$$ \chi_c(a) \sim \chi_l(a) \sim (a - e)^{-\rho}, \tag{5} $$

$$ \chi_a(a) \sim (a - e)^{-\rho'}, \tag{6} $$

when a → e^+. The exponent ρ could be numerically measured
by plotting log(χ_c) or log(χ_l) versus log(a - e).
Unfortunately, this gives poor results because the curvature
of the plot is too large, with a slope ρ changing from
1 to 0.5. But it is possible for a_eff. Fig. 5 is a log-log plot
of χ_a(a) versus (a - e). Symbols are lined up correctly
and the global slope gives the estimation

ρ' = 1.5(1).

The studies of isolated points are summarized on Fig. 6.
Monte-Carlo averages of |I|/N are compared with the
infinite N results, i(a): error bars and finite size effects
are so small that they are not visible. On the other hand,
the variance χ_i(a) = Var(|I|)/N shows bigger statistical
fluctuations, but the finite size effects remain small. This
variance does not diverge anywhere. However we see a
cusp when a → e^+ compatible with

$$ \chi_i(e) - \chi_i(a) \sim (a - e)^{\tau} $$

with estimations χ_i(e) = 0.095(5) and τ = 0.6(1). As
τ < 1, the first derivative is infinite at a = e^+.

Figure 6: Monte-Carlo averages |I|/N and variance χ_i(a)
of the number of isolated points. The solid line is the
analytical result i(a) for N = ∞.



5.3 Finite size scaling 

We now concentrate on the finite N behavior, first exactly
at the transition a = e and then in the critical region
around this transition.

Figure 7: Top to bottom: the average connectivity (mean
and width), the number of edges (mean and width) and
the size (mean and width) of the core versus the size N of
the random graph. The axes are labeled by decimal
logarithms. The negative slopes are measurements of -φ (for
the connectivity) and ω - 1 (for size and edges).



By analogy with the classical percolation transition at $a = 1$, where the
size of the largest connected component is of order $O(N^{2/3})$ and its
average connectivity is $2 + O(1/N^{2/3})$, we postulate the existence of
other critical exponents $\omega$ and $\phi$ defined by

$$ N_c \sim L_c \sim N^{\omega}, \tag{7} $$
$$ \frac{2 L_c}{N_c} - 2 \sim N^{-\phi} \tag{8} $$



when $N \to \infty$ at $a = e$. This hypothesis is tested in Fig. 7: the data
are well fitted by power laws with

$$ \omega = 0.63(1) \quad \text{and} \quad \phi = 0.21(1). $$
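The measurement behind these fits can be sketched in a few lines. The following is a minimal illustration, not the simulation code used for the figures: the helper names (`random_graph`, `leaf_removal`, `mean_core_size`) are hypothetical, and the random graph is assumed to be drawn with a fixed number of about $aN/2$ distinct edges. A leaf is taken to be a vertex with a single neighbor together with that neighbor, and both are removed along with all edges touching the neighbor.

```python
import math
import random

def random_graph(n, a, rng):
    """Draw about a*n/2 distinct random edges on n vertices (assumed model)."""
    m = round(a * n / 2)
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((min(u, v), max(u, v)))
    return sorted(edges)

def leaf_removal(n, edges):
    """Return (N_c, L_c, |I|) once no leaf remains."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    removed = [False] * n
    stack = [w for w in range(n) if len(adj[w]) == 1]
    while stack:
        w = stack.pop()
        if removed[w] or len(adj[w]) != 1:
            continue  # stale entry: w is no longer a pendant vertex
        (v,) = adj[w]  # v is the unique neighbor of w
        for u in adj[v]:
            adj[u].discard(v)  # delete every edge touching v
            if len(adj[u]) == 1:
                stack.append(u)  # u has become a pendant vertex
        adj[v].clear()
        removed[v] = removed[w] = True
    alive = [u for u in range(n) if not removed[u]]
    isolated = sum(1 for u in alive if not adj[u])
    core = [u for u in alive if adj[u]]
    return len(core), sum(len(adj[u]) for u in core) // 2, isolated

def mean_core_size(n, samples, seed=0):
    """Monte-Carlo average of N_c at a = e over `samples` random graphs."""
    rng = random.Random(seed)
    sizes = [leaf_removal(n, random_graph(n, math.e, rng))[0]
             for _ in range(samples)]
    return sum(sizes) / samples
```

Plotting the logarithm of `mean_core_size(n, samples) / n` against the logarithm of `n` for several sizes would give a slope to compare with $\omega - 1$; the sample counts and sizes needed for a reliable fit are not specified here.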



Of course, if the large $N$ behavior is modified by a logarithmic function
(or a power of it), the true values of the exponents differ from their
apparent values when $N$ is large but finite. Here these exponents are
determined from the averages of the Monte-Carlo measurements. The widths
$\sigma$ of their distributions are also plotted in Fig. 7: means and widths
have similar slopes. Consequently

$$ \chi_c(e) \sim \chi_l(e) \sim N^{2\omega - 1} \quad \text{and} \quad \chi_a(e) \sim N^{1 - 2\phi} \tag{9} $$

diverge when $N \to \infty$ at $a = e$.

As widths and means of the Monte-Carlo measurements are of the same order,
the distributions remain broad in the large $N$ limit at the transition. On
the contrary, when $a \neq e$, the distributions are sharp. In Fig. 8, the
cumulative distribution functions $\mathrm{Prob}(N_c/N^{\omega} < x)$,
$\mathrm{Prob}(L_c/N^{\omega} < x)$ and
$\mathrm{Prob}((2L_c/N_c - 2)N^{\phi} < x)$ are plotted as functions of the
scaling variable $x$ for $a = e$. We observe that the curves converge when
$N$ is large to scaling distributions (independent of $N$), and this confirms
the hypotheses Eq. (7) and Eq. (8).

Figure 8: Cumulative distribution functions of the size, the number of edges
and the average connectivity of the core, as functions of their respective
scaled variables. We have set $\omega = 0.63$ and $\phi = 0.21$.

When $x$ becomes large, these scaling distribution functions decrease like a
Gaussian. Consequently, they are not broad distributions, in the sense that
every moment is finite, in agreement with Eq. (9). On the other side, these
functions seem to be power laws for small $x$. This allows us to define
another exponent $\delta$ with

$$ \mathrm{Prob}(N_c/N^{\omega} < x) \sim \mathrm{Prob}(L_c/N^{\omega} < x) \sim x^{\delta} \tag{10} $$

when $x \to 0$. Our estimate is

$$ \delta = 0.36(3). $$

The numerical values suggest that $\omega + \delta = 1$, but we have no
argument to explain it.

By considering the probability that the core of a random graph is empty at
$a = e$, we measured a new exponent

$$ \eta = 0.25(1) $$

where $\eta$ is defined by

$$ \mathrm{Prob}(N_c = 0) \sim N^{-\eta}. $$

The limit $x \to 0$ in Eq. (10) gives the conjectured relation
$\eta = \omega\delta$, which is numerically acceptable. With the hypothesis
$\omega + \delta = 1$, it gives $\eta = \omega(1 - \omega)$.
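The step from the small-$x$ power law to $\eta = \omega\delta$ can be spelled out as follows. This is only a heuristic sketch: it assumes that the power law for the cumulative distribution of $N_c/N^{\omega}$ still holds down to $x$ of order $N^{-\omega}$, i.e. for cores of a few vertices.

```latex
\mathrm{Prob}(N_c = 0)
  \approx \mathrm{Prob}\!\left(\frac{N_c}{N^{\omega}} < \frac{1}{N^{\omega}}\right)
  \sim \left(N^{-\omega}\right)^{\delta} = N^{-\omega\delta}
  \quad\Longrightarrow\quad \eta = \omega\delta ,
\qquad
\omega + \delta = 1 \;\Longrightarrow\; \eta = \omega(1-\omega).
```

With the Monte-Carlo values, $\omega\delta = 0.63 \times 0.36 \approx 0.23$, to be compared with the measured $\eta = 0.25(1)$.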

We have also considered Prob(Lc = Nc), i.e. the proba- 
bility that the average connectivity of the core is exactly 2. 
In this case, the core is made of one or several simple loops 
without branching. The Monte-Carlo study indicates that 
the large N limit could be a pure number: 0.12(2). More 
intensive simulations are needed to confirm (or invalidate) 
this result. 

More relations between critical exponents can be obtained by using the
finite size scaling hypothesis [?]: in the vicinity of the transition, the
behavior of finite random graphs is determined by the scaling variable

$$ y = (a - e) N^{\theta} $$

where $\theta$ is a positive scaling exponent.

First we briefly recall the scaling theory for a general quantity $Q(N, a)$,
for size $N$ and connectivity $a$. For $N = \infty$, let us suppose that

$$ Q(a) \sim (a - e)^{\gamma} $$

when $a \to e^+$ ($\gamma$ could be positive or negative). Then we expect in
the critical region that

$$ Q(N, a) \sim N^{-\gamma\theta}\, \tilde{Q}(y) $$

where the scaling function $\tilde{Q}(y)$ is defined by

$$ \tilde{Q}(y) = \lim_{N \to \infty} N^{\gamma\theta}\, Q(N, e + y/N^{\theta}), $$

which behaves as

$$ \tilde{Q}(y) \sim y^{\gamma} \quad \text{when } y \to +\infty. $$

As $y = 0$ exactly at the transition $a = e$,

$$ Q(N, e) \sim N^{-\gamma\theta}. $$

Figure 9: Finite size scaling for the edges of the core in the critical
region.
So the exponent of finite size effects at the transition and the exponent of
critical behavior for $N = \infty$ in the vicinity of the transition are
related by $\theta$. This remark is useful only if different quantities share
the same $\theta$. For usual models of statistical physics with a 2-D or 3-D
geometry (like classical spin systems), the exponent $\theta$ describes the
divergence of the correlation length $\xi$. So the uniqueness of $\xi$
implies the uniqueness of $\theta$. Unfortunately for random graphs, we have
no equivalent length and no simple phenomenological interpretation for
$\theta$. However, we shall assume that $\theta$ is unique.
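As a consistency check of the scaling form (a sketch that only uses the definitions of this subsection, with the scaling function written $\tilde{Q}$ here), inserting the large-$y$ behavior of the scaling function into the finite size form recovers the $N = \infty$ law, while setting $y = 0$ gives the finite size law at the transition:

```latex
Q(N,a) \sim N^{-\gamma\theta}\,\tilde{Q}(y)
\;\xrightarrow[\,y\to+\infty\,]{}\;
N^{-\gamma\theta}\,\bigl[(a-e)N^{\theta}\bigr]^{\gamma} = (a-e)^{\gamma},
\qquad
Q(N,e) \sim N^{-\gamma\theta}\,\tilde{Q}(0) \propto N^{-\gamma\theta}.
```

The powers of $N$ cancel in the first expression for any $\theta$, which is why $\theta$ cannot be fixed by the $N = \infty$ behavior alone and has to be extracted from finite size data.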

As we have computed exact $N = \infty$ formulae earlier, we can directly
study $Q(N, a) - Q(a)$, i.e. the finite size effects. The scaling function is
now $\tilde{Q}(y) - y^{\gamma}$ and is maximal around $y \sim 0$. From a
numerical point of view, the analysis becomes easier than that of the
monotonic function $\tilde{Q}(y)$.

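Let us now consider $N_c/N$ and $L_c/N$. As $c(a) \sim l(a) \sim (a - e)$
when $a \to e^+$, for these quantities $\gamma = 1$. Comparing the finite
size law $N^{-\gamma\theta} = N^{-\theta}$ with the behavior
$N_c/N \sim L_c/N \sim N^{\omega - 1}$ at $a = e$ then gives

$$ \theta = 1 - \omega. $$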

In Fig. 9, with the choice $\theta = 0.37$ (induced by the numerical
measurement of $\omega$), $N^{\theta}(L_c/N - l(a))$ is plotted versus
$y = (a - e)N^{\theta}$. We see that the data superpose well: they draw the
scaling function. Note that $\theta$ is the unique fitting parameter for this
figure.

        $\theta, \delta$   $\omega$   $\phi$    $\rho$   $\rho'$   $\eta$
  MC    0.36(3)            0.63(1)    0.21(1)            1.5(1)    0.25(1)
  a     1/3                2/3        1/6       1        2         2/9
  b     0.37               0.63       0.185     0.70     1.70      0.233
  c     2/5                3/5        1/5       1/2      3/2       24/100
  d     0.42               0.58       0.21      0.38     1.38      0.244

Table 1: Critical exponents for the geometric phase transition when the
average connectivity of a large random graph is $a = e$. The line "MC"
displays the results of Monte-Carlo simulations. The lines "a-d" are a few
sets of values compatible with scaling relations. The line "c" has our
preference (see text).

Let us now consider the variances $\chi_c(a)$ and $\chi_l(a)$. Using Eq. (9)
and their divergence with exponent $\rho$ when $a \to e^+$, the finite size
scaling hypothesis gives the new relation

$$ \rho\theta = 2\omega - 1. $$

The same analysis can be done with the average connectivity of the core. As
$a_{\mathrm{eff}} - 2 \sim (a - e)^{1/2}$, the corresponding $\gamma = 1/2$.
Using Eq. (8), the scaling relation is

$$ \theta = 2\phi. $$

For the variance of the connectivity $\chi_a$, Eq. (9) and its divergence
with exponent $\rho'$ give

$$ \rho'\theta = 1 - 2\phi. $$

By eliminating $\theta$, other relations are obtained: $\rho' = \rho + 1$ and
$2\phi + \omega = 1$.

With these four scaling relations and the Monte-Carlo determinations, we
will now try to conjecture the exact values of these exponents. Table 1
summarizes the following considerations. The results of Monte-Carlo
simulations are given in line "MC". Other lines are suggestions for sets of
exponents compatible with the four scaling relations.

The line "b" is obtained by using the numerical determination of $\omega$
and the scaling relations. In particular, it gives $\theta = 0.37(1)$. The
line "d" uses the numerical determination of $\phi$; it gives
$\theta = 0.42(2)$. As the difference between these two values of $\theta$ is
about twice as large as the uncertainty, we cannot definitely conclude
whether the size and the connectivity of the core share the same scaling
exponent $\theta$ or not.

The line "a" is obtained by assuming that $\omega = 2/3$ and $\theta = 1/3$,
which are the values [?] of the corresponding exponents for the classical
percolation of random graphs at $a = 1$. This hypothesis seems incompatible
with the Monte-Carlo estimates of $\phi$ and $\rho'$. Furthermore, the
average connectivity of the giant component near the classical percolation
transition behaves as $2 + O((a - 1)^2)$, to be compared with
$2 + O((a - e)^{1/2})$ for the core, and consequently the corresponding
exponent $\phi$ is $2\theta = 2/3$ (but not $\theta/2 = 1/6$).

This gives strong evidence that the analogy between 
exponents of percolation and core transitions cannot be 
complete and that the effective low energy field theory de- 
scriptions in the vicinity of the transition are different. In 
particular, they may well have a different upper critical 
dimension. 

The line "c" assumes that the exponent $\rho' = 1.5(1)$ is exactly 3/2. This
is very attractive because the exponents are simple rational fractions and
the value $\theta = 2/5$ lies between the numerical estimates 0.42 and 0.37.

Of course, nothing in the theory of critical phenomena requires that
critical exponents be rational numbers with small numerators (for a recent
example, see Ref. [10]). However, if we want to reconcile numerical
simulations, theoretical considerations and simple rational fractions, we
are led to conjecture $\omega = 3/5$, $\phi = 1/5$, $\rho = 1/2$,
$\rho' = 3/2$, $\delta = \theta = 2/5$ and $\eta = 6/25$.
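The lines of Table 1 can be regenerated from a single input with a few lines of arithmetic. The sketch below is only an illustration: the function name `exponents_from_omega` is hypothetical, and it assumes the relations $\theta = 1 - \omega$, $\theta = 2\phi$, $\rho\theta = 2\omega - 1$, $\rho'\theta = 1 - 2\phi$ together with the conjectured $\omega + \delta = 1$ and $\eta = \omega\delta$.

```python
from fractions import Fraction

def exponents_from_omega(omega):
    """Derive the other exponents from omega via the scaling relations."""
    theta = 1 - omega                    # theta = 1 - omega
    phi = theta / 2                      # theta = 2 phi
    rho = (2 * omega - 1) / theta        # rho theta = 2 omega - 1
    rho_prime = (1 - 2 * phi) / theta    # rho' theta = 1 - 2 phi
    delta = 1 - omega                    # conjectured omega + delta = 1
    eta = omega * delta                  # conjectured eta = omega delta
    return {"theta": theta, "delta": delta, "omega": omega, "phi": phi,
            "rho": rho, "rho_prime": rho_prime, "eta": eta}

line_a = exponents_from_omega(Fraction(2, 3))   # classical percolation values
line_b = exponents_from_omega(0.63)             # measured omega
line_c = exponents_from_omega(Fraction(3, 5))   # conjectured values
line_d = exponents_from_omega(1 - 2 * 0.21)     # omega deduced from measured phi
```

Using `Fraction` keeps lines "a" and "c" exact, so that for instance $\eta = 6/25$ for line "c" comes out without rounding; lines "b" and "d" are floating-point and agree with the table after rounding.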

To reduce the uncertainties in Monte-Carlo simulations, larger sizes $N$ are
needed, in particular if the large $N$ behavior is affected by logarithmic
corrections. Moreover, we hope to make progress in analytical methods for
calculating these exponents as well.

6 Applications 

We now discuss two applications of the structure of the 
core: the first one to the conductor-insulator transitions 
in random graphs and the second one to combinatorial 
optimization problems. 

6.1 Localization on random graphs 

We denote by $Z(G)$ the dimension of the kernel (the subspace of
eigenvectors with eigenvalue 0) of the adjacency matrix of the graph $G$. It
is known [?] that $Z(G)$ is invariant under leaf removal (see Ref. [12] for
a proof and an application to random trees). As the adjacency matrix is
block-diagonal with one block per connected component, $Z(G)$ is additive on
connected components. These two properties imply that

$$ Z(G) = Z(C) + Z(I) = Z(C) + |I| \le N_c + |I|. $$

The last equality holds because the adjacency matrix vanishes for a
collection of isolated points. This analysis applies to any graph, and
remains valid after averaging. Even if the probability distribution is not
that of a random graph, we see that as soon as leaves appear with a
non-vanishing weight (this is true for instance if the probability
distribution is that of a lattice with impurities), the spectrum of the
adjacency matrix has a delta peak at the origin. However, the fact that, as
we show below for random graphs, leaf removal accounts for the full weight
of this delta peak seems rather non-generic.
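The invariance of $Z(G)$ under leaf removal is easy to check on small examples. The following sketch is an illustration only (the helper names `nullity` and `remove_leaf` are hypothetical); it computes the kernel dimension by exact Gaussian elimination over the rationals and compares it before and after one leaf removal, with $(v, w)$ denoting the neighbor $v$ and the single-neighbor vertex $w$.

```python
from fractions import Fraction

def nullity(vertices, edges):
    """Kernel dimension of the adjacency matrix, by exact Gaussian elimination."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    rows = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        rows[index[u]][index[v]] = rows[index[v]][index[u]] = Fraction(1)
    rank = 0
    for col in range(n):
        pivot = next((r for r in range(rank, n) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return n - rank

def remove_leaf(vertices, edges, v, w):
    """Delete the pendant vertex w, its neighbor v and every edge touching v."""
    kept_vertices = [u for u in vertices if u not in (v, w)]
    kept_edges = [(x, y) for x, y in edges if v not in (x, y)]
    return kept_vertices, kept_edges

# Square 0-1-2-3 with a pendant path 0-4-5: (v, w) = (4, 5) is the only leaf.
vertices = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (4, 5)]
core_vertices, core_edges = remove_leaf(vertices, edges, 4, 5)
```

In this example the core is the square, and `nullity(vertices, edges)` can be compared with `nullity(core_vertices, core_edges)`; exact rational arithmetic is used so that the rank is not affected by rounding.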
Taking the average of these formulae for random graphs and using our
results on the core, we get that $Z(G) = N z(a) + o(N)$ for a large random
graph $G$ with average connectivity $a$, with

$$ z(a) = i(a) \quad \text{for } a < e, \tag{11} $$
$$ i(a) \le z(a) \le c(a) + i(a) \quad \text{for } a > e. $$

It has been argued in Ref. [?] that

$$ z(a) = \frac{A + B + AB}{a} - 1 \tag{12} $$

for all values of $a$. Combined with our present results, this means that

$$ z(a) = i(a) \quad \text{for all values of } a. $$

We may interpret Eq. (11) as an independent proof of Eq. (12) for $a < e$,
and we may also infer that the adjacency matrix of the core of a random
graph at $a > e$ has a kernel of size $o(N)$.

In Ref. [?], it was shown that $e$ lies in a domain of the parameter $a$ for
which delocalized vectors are responsible for a finite fraction of the size
of the kernel. Imagine that we start to increase $a$ very slowly from
$a = e$ by adding new random edges one by one to the random graph. We watch
the competition between the core (which, as we have seen, carries few
elements of the kernel), the localized eigenvectors in the kernel and the
delocalized ones. The competition between the core and the full kernel is
not very strong: when $n$ edges, with $1 \ll n \ll N$, are added to the
graph, the core grows on average by $24\,e^{-2} n$ vertices, while
$2\,e^{-2} n$ vectors in the kernel are lost. However, by the results of
Ref. [?], delocalized eigenvectors disappear and localized eigenvectors
replace them. It is intuitive that delocalized eigenvectors in the kernel,
which live on large structures of the random graph, have a high probability
of being perturbed by the growing core. But the precise mechanism by which
their extinction is almost compensated by new localized vectors in the
kernel remains to be elucidated.

The concept of leaf removal process can also be used to analyze the
localization-delocalization transitions that occur at $a_d \approx 1.42153$
and $a_r \approx 3.15499$. As shown in Ref. [?], the localized eigenvectors
in the kernel live on definite structures that can be drawn on the random
graph. These structures are finite (connected) trees that

• can be bicolored brown-green in such a way that all vertices with 0 or 1
neighbor are green and all neighbors of the green vertices in the random
graph belong to the tree; the neighbors of the brown vertices, on the other
hand, can be anywhere on the random graph.

• are maximal, i.e. are not part of a larger tree with the same properties.
Observe that each isolated point is maximal.



We can put marks on all vertices belonging to such structures and build
histories of leaf removal processes such that the initial steps remove only
marked vertices and, after these steps, the only remaining marked vertices
are isolated.

Then if $a$ is small ($a < a_d$) or large ($a > a_r$), the number of
isolated marked points is $N i(a) + o(N)$. This implies in particular that
at most $o(N)$ bunches of the remaining graph contain more than one leaf.

On the other hand, if $a \in\, ]a_d, a_r[$, the number of isolated marked
points is less than $N i(a)$: a number of order $N$ of non-trivial bunches
will have to appear somewhere during the rest of the leaf removal process.

6.2 Vertex covers and matchings 

Apart from the size of the kernel, several other interesting 
quantities attached to graphs behave rather simply un- 
der leaf removal. We mention two, which are related to 
combinatorial optimization problems. 

Vertex cover: A vertex cover of a graph is a subset of the vertices
containing at least one extremity of every edge of the graph. We denote by
$X(G)$ the minimal size of a vertex cover of a graph $G$.

There is a nice "practical" interpretation of $X(G)$. Imagine that the edges
of the graph are the (linear) corridors of a museum, the vertices
corresponding to ends of corridors. A guard sitting at a vertex can control
all the incident corridors. So $X(G)$ is the minimum number of guards needed
to control all corridors of the museum.

Edge disjoint subset, matching: An edge disjoint subset is a subset of the
edges such that no two edges in the subset have a vertex in common. This is
also called a matching. We denote by $Y(G)$ the maximal size of an edge
disjoint subset in a graph $G$. Finding such a maximal edge disjoint subset
is called the matching problem.

The determinations of $X(G)$ and $Y(G)$ for a given $G$ are archetypes of
two classes of optimization problems. While it is known that the matching
problem can be solved in polynomial time (see e.g. Ref. [?] and references
therein), the museum guard problem is a hard problem of the
Non-deterministic Polynomial (NP) class: no polynomial time algorithm is
known to solve it (and such an algorithm is not expected to exist; this is
related to the famous P $\neq$ NP conjecture), but given a candidate set of
guards of a prescribed size, it is easy to check in polynomial time that it
controls every corridor.

When $G$ is a large random graph, one may ask for thermodynamic solutions of
these problems, when only the extensive contributions to $X(G)$ or $Y(G)$
are considered as relevant. This leads to the following definition:

Vertex cover fraction $x(a)$, matching fraction $y(a)$: For fixed $a$, the
vertex cover fraction $x(a)$ and the matching fraction $y(a)$ are the limits
of the averages of $X(G)/N$ and $Y(G)/N$ when $G$ is a random graph of size
$N$ and $N \to \infty$.

Let us note however that one can probably exhibit combinatorial optimization
problems for which the thermodynamic solution can be obtained in polynomial
time, but for which taking into account the non-extensive remainder is
prohibitively long.

The replica trick has been used by Hartmann and Weigt to obtain a lower
bound for $x(a)$ in a series of papers [13, 14]. They have shown that for
$a < e$, the replica symmetric solution is stable, whereas it becomes
unstable for $a > e$. The replica symmetric solution leads to

$$ x(a) = 1 - \frac{2W + W^2}{2a} \quad \text{for } a < e. $$



This relation has to break down somewhere, because a result of Frieze [15]
implies that

$$ x(a) = 1 - \frac{2}{a}\left(\log a - \log\log a - \log 2 + 1\right) + o\!\left(\frac{1}{a}\right), $$

whereas $W \sim \log a$ for large $a$, so that the asymptotics of
$1 - \frac{2W + W^2}{2a}$ starts with $1 - \frac{\log^2 a}{2a}$. Weigt and
Hartmann have also used a good algorithm to get an approximation of a
minimal vertex cover. The idea is essentially to look for a vertex with a
maximal number of incident corridors and put a guard there. Then remove the
site and the adjacent corridors and iterate. This is always fast, but gives
only an upper bound for $X(G)$. This can be refined, but then the algorithm
needs a very long time when $a > e$.
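The greedy idea described above can be sketched as follows. This is an illustrative reading of the heuristic, not the implementation of Weigt and Hartmann: the function names are hypothetical and ties between vertices of equal degree are broken arbitrarily (here by vertex order).

```python
def greedy_vertex_cover(n, edges):
    """Repeatedly put a guard at a vertex of maximal remaining degree."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    guards = []
    while any(adj.values()):  # some corridor is still uncontrolled
        v = max(adj, key=lambda u: len(adj[u]))  # first vertex of maximal degree
        guards.append(v)
        for u in adj.pop(v):  # remove the site and its corridors
            adj[u].discard(v)
    return guards

def is_vertex_cover(guards, edges):
    """Check that every edge has at least one extremity in `guards`."""
    chosen = set(guards)
    return all(u in chosen or v in chosen for u, v in edges)

# A center 0 with three neighbors 1, 2, 3, each carrying a pendant vertex.
example_edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]
greedy_guards = greedy_vertex_cover(7, example_edges)
```

On this small tree the greedy rule starts with the center and ends up with four guards, whereas the three guards at vertices 1, 2 and 3 already control every corridor: the heuristic only yields an upper bound for $X(G)$.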

We now show how leaf removal can be applied to the museum guard problem. If
$(v, w)$ is a leaf of $G$, there is a minimal vertex cover with a guard at
$v$. This is because any vertex cover has a guard at $v$ or at $w$, and a
guard at $v$ makes the guard at $w$ useless. So if a minimal vertex cover
has a guard at $w$, moving it to $v$ yields another minimal vertex cover.
Isolated vertices do not need guards. The leaf removal of $(v, w)$ leading
from $G$ to $G'$ removes exactly the corridors controlled by the guard at
$v$. Hence $X(G) = X(G') + 1$.

An analogous argument applies to maximum edge disjoint subsets. Indeed, if
$(v, w)$ is a leaf of $G$, there is a maximal edge disjoint subset that
contains $\{v, w\}$. This is because if no edge of an edge disjoint subset
touches $v$, this edge disjoint subset is not maximal (it can be completed
with $\{v, w\}$), and if an edge disjoint subset has an edge that touches
$v$, replacing this edge by $\{v, w\}$ yields an edge disjoint subset of the
same size. The leaf removal of $(v, w)$ leading from $G$ to $G'$ removes,
apart from $\{v, w\}$, exactly the edges that cannot belong to any edge
disjoint subset containing $\{v, w\}$. Hence $Y(G) = Y(G') + 1$.

Some general inequalities can be proved. For instance, $X(G) \ge Y(G)$ (the
triangle is an example where the inequality is strict) and
$Z(G) \ge N - 2X(G)$. However, $Z(G) - N + 2Y(G)$ can have any sign
(negative for the triangle but positive for the square).

In any case, at each leaf removal, two vertices are removed, so $Z(G)$,
$N(G) - 2X(G)$ and $N(G) - 2Y(G)$ are invariant under leaf removal.
Moreover, $X$ and $Y$ vanish for unions of isolated vertices. To summarize,

$$ X(G) = X(C) + \frac{N - N_c - |I|}{2}, $$
$$ Y(G) = Y(C) + \frac{N - N_c - |I|}{2}, $$
$$ Z(G) = Z(C) + |I|. $$
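These summary formulae for $X$ and $Y$ can be checked on small graphs by comparing leaf removal with an exhaustive search. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the brute-force searches are exponential and only meant for graphs with a handful of vertices, and $(v, w)$ denotes a leaf with single-neighbor vertex $w$.

```python
from itertools import combinations

def min_vertex_cover(n, edges):
    """X(G) by exhaustive search (exponential time, tiny graphs only)."""
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return k

def max_matching(edges):
    """Y(G) by exhaustive search (exponential time, tiny graphs only)."""
    for k in range(len(edges), -1, -1):
        for subset in combinations(edges, k):
            touched = [x for e in subset for x in e]
            if len(touched) == len(set(touched)):  # no shared vertex
                return k

def leaf_removal_summary(n, edges):
    """Return (number of leaf removals, core edges, N_c, |I|)."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    removals = 0
    while True:
        w = next((u for u in adj if len(adj[u]) == 1), None)
        if w is None:
            break  # no leaf remains
        (v,) = adj[w]
        for u in adj.pop(v):  # delete v and every edge touching it
            adj[u].discard(v)
        del adj[w]
        removals += 1
    core_edges = sorted((u, x) for u in adj for x in adj[u] if u < x)
    isolated = sum(1 for u in adj if not adj[u])
    return removals, core_edges, len(adj) - isolated, isolated

# Square 0-1-2-3 with a pendant path 0-4-5: the core is the square.
n = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (4, 5)]
removals, core_edges, n_core, n_isolated = leaf_removal_summary(n, edges)
```

Each leaf removal deletes two vertices, so `removals` equals $(N - N_c - |I|)/2$, and the brute-force values of $X$ and $Y$ on the full graph can be compared with their values on the core plus `removals`.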

Karp and Sipser [?] have devised an approximate algorithm to get a good (if
not optimal) matching. There are two possible transformations:

(1) Remove a leaf,

(2) Choose an edge at random, remove it together with its extremities and
all edges touching the extremities,

and they are performed according to the following rule:

At each step, until the graph is empty, do (1) if possible and if not, do
(2). So starting from $G$, one applies (1) until the core of $G$ is
obtained. Then (2) is applied as long as no new leaf appears. As soon as a
graph with leaves appears, apply (1) to reach the core of this new graph,
and so on.

At each step an edge is singled out, and by construction, the set of all
these edges defines a matching, i.e. an edge disjoint subset.
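The rule above translates almost directly into code. The following is a minimal sketch of the two transformations as described in the text, not a reproduction of Karp and Sipser's own implementation; the function name is hypothetical, and the linear scans make this version quadratic in the worst case, which a careful encoding would avoid.

```python
import random

def karp_sipser(n, edges, rng):
    """Greedy matching: remove a leaf if there is one, else a random edge."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    matching = []
    while True:
        pendant = next((u for u in adj if len(adj[u]) == 1), None)
        if pendant is not None:
            w = pendant
            (v,) = adj[w]  # transformation (1): the leaf (v, w)
        else:
            remaining = sorted((u, x) for u in adj for x in adj[u] if u < x)
            if not remaining:
                break  # no edge is left
            v, w = rng.choice(remaining)  # transformation (2): random edge
        matching.append((v, w))
        for x in (v, w):  # remove both extremities and all their edges
            for u in adj.pop(x):
                if u in adj:
                    adj[u].discard(x)
    return matching

path_matching = karp_sipser(4, [(0, 1), (1, 2), (2, 3)], random.Random(1))
```

On a graph whose core is empty only transformation (1) is ever used, so the output is a maximum matching; the random choices matter only once a core has been reached.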

When $G$ is a random graph with $a < e$, the core is small ($o(N)$ for large
$N$). Thus,

$$ x(a) = y(a) = \frac{1 - z(a)}{2} = 1 - \frac{2W + W^2}{2a} \quad \text{for } a < e, $$

which gives in particular an independent proof of the result of Weigt and
Hartmann. Note that in this case, the approximate algorithm is to put a
guard at a vertex touching as many edges as possible, then remove it and
iterate, whereas the exact algorithm is almost the opposite, namely, put a
guard at the connected end of a leaf, remove the leaf and iterate. Leaf
removal gives a very fast algorithm (linear in $N$ if the graph is properly
encoded) to construct a minimal vertex cover when $a < e$.
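The closed formula for $a < e$ is easy to evaluate numerically. The sketch below assumes that $W$ denotes the Lambert function, i.e. the solution of $W e^{W} = a$, which is consistent with the asymptotics $W \sim \log a$ quoted above; the function names and the Newton iteration are illustrative choices.

```python
import math

def lambert_w(a, tol=1e-14):
    """Solve W * exp(W) = a for a >= 0 by Newton iteration (assumed definition of W)."""
    w = math.log(1.0 + a)  # starting guess, not below the root for a >= 0
    for _ in range(100):
        step = (w * math.exp(w) - a) / (math.exp(w) * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w

def cover_fraction_below_e(a):
    """x(a) = y(a) = 1 - (2W + W^2) / (2a), meant for 0 < a < e."""
    w = lambert_w(a)
    return 1.0 - (2.0 * w + w * w) / (2.0 * a)
```

Two simple sanity checks: at $a = e$ one has $W = 1$, so the formula reduces to $1 - 3/(2e)$, and for small $a$ it behaves like $a/2$, as expected for a graph made mostly of isolated vertices and isolated edges.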

Karp and Sipser have shown that for a large random graph with $a > e$, their
algorithm for matching finds with high probability a matching of about
$N_c/2$ edges in the core. This is a lower bound for the matching number of
the core, but at the same time, this is the maximum possible. So this shows
at once that the core has a thermodynamically perfect matching, and that
their algorithm is thermodynamically optimal. Hence

$$ y(a) = 1 - \frac{A + B + AB}{2a} \quad \text{for any } a, $$

so that the relation $y(a) = (1 - z(a))/2$ is valid for every $a$: the fact
that the core does not contribute thermodynamically to zero eigenvalues and
the fact that it has a thermodynamically perfect matching are closely
related.

If $a > e$, leaf removal stops while an extensive number of edges are still
present: this gives a lower bound for $x(a)$ which is very poor at large
$a$. But it seems clear that the replica symmetry breaking [?] at $a = e$ is
tightly connected to the fact that the structure of the core of a random
graph is more complicated than the structure of the parts eliminated by leaf
removal, so that a more refined Parisi order parameter is needed to describe
the phase $a > e$. While we have seen that the matching parameter $y(a)$ and
the kernel-size parameter $z(a)$ are well understood and closely related,
the exact evaluation of the vertex cover parameter $x(a)$ remains a
challenge.

7 Conclusions 

In this paper we have presented a physicist's analysis of 
a deep feature of random graphs: a geometric second
order phase transition with the emergence of a core at 
threshold a = e. This core is the residue when leaves 
(i.e. points with a single neighbor and this neighbor) and 
isolated points are iteratively removed. We have argued 
that the core is dominated by a giant component. The 
dominant contribution of the large N behavior of the rele- 
vant thermodynamic quantities was computed exactly by 
a direct counting method. We have studied numerically 
the finite size behavior of the core, defined a variety of new 
critical exponents and obtained approximations consistent 
with their mutual relationships. Our analysis excludes 
the exponents of standard percolation in random graphs. 
Finally, we have applied our results to the localization 
transition and to combinatorial optimization problems on 
random graphs. 

However, more analytical or numerical work is needed to identify without
any doubt the exponents for the phase transition at $a = e$. An open
question is the interaction between the emergence of the core and the
delocalized eigenvectors of the adjacency matrix with eigenvalue 0. A fine
study of the distribution of the sizes of the connected components of the
core could be done with Monte-Carlo simulations: for $a > e$, we expect a
giant component, plus a finite number of finite components. Moreover, we
have shown that the number of eigenvectors with eigenvalue 0 living on the
core is $o(N)$, and numerical simulations could give the precise
asymptotics.

Finally, as the core percolation appears in a simple 
model of random graphs, which is governed only by one 
parameter, the average connectivity a, we expect that this 
transition is universal in the sense that some characteris- 
tics of this transition (second order, critical exponents, 
etc., but not the precise value of a = e at the transition) 
could be seen in other models or real materials. 

Acknowledgments: We are very indebted to Graham 
Brightwell and Boris Pittel for drawing our attention to 
the mathematical literature on matching in random graphs 
and its relevance for our work. 



Appendix: The core is well-defined 

In this appendix, we show by induction on the number $N$ of vertices of $G$
that the property

    $\mathcal{P}_N$ = "the number of isolated points $|I|$ after leaf
    removal and the core $C$ of a graph $G$ on $N$ vertices do not depend
    on the history"

holds for any $N \ge 0$.

To start the induction, if $G$ has 0 or 1 vertex, there is no leaf, hence
there is only one history, so $\mathcal{P}_0$ and $\mathcal{P}_1$ are true.
Suppose now that $\mathcal{P}_0, \dots, \mathcal{P}_{N-1}$ are proved and
take a graph $G$ on $N \ge 2$ vertices. We distinguish several cases:

1 If $G$ has no leaf, there is only one history so $\mathcal{P}_N$ is true
for $G$.

2 If $G$ has exactly one leaf, all histories start with the same first leaf
removal and lead to the same $G'$, for which $\mathcal{P}_{N-2}$ is true by
the induction hypothesis, so $\mathcal{P}_N$ is true for $G$.

3 If $G$ has at least two leaves, we compare two histories:

$$ \mathcal{H}_1 = G, (v_1, w_1), G'_1, \cdots \quad \text{and} \quad \mathcal{H}_2 = G, (v_2, w_2), G'_2, \cdots. $$

3a If $\{v_1, w_1\} = \{v_2, w_2\}$, then $G'_1 = G'_2$, to which the
induction hypothesis $\mathcal{P}_{N-2}$ applies, so that $C_1 = C_2$ and
$|I_1| = |I_2|$.

3b If $v_1 = v_2$ but $w_1 \neq w_2$ (the two leaves are distinct but belong
to the same bunch; this can happen only if $N \ge 3$), then $G'_1$ has $w_2$
as an isolated point, $G'_2$ has $w_1$ as an isolated point, but
$G'_1 \setminus \{w_2\} = G'_2 \setminus \{w_1\} = G''$, say. Further leaf
removals can take place only on $G''$, to which the induction hypothesis
$\mathcal{P}_{N-3}$ applies, so again $C_1 = C_2$ and $|I_1| = |I_2|$.

3c Suppose that $(v_1, w_1)$ and $(v_2, w_2)$ do not belong to the same
bunch. This can happen only if $N \ge 4$. Then $(v_1, w_1)$ is a leaf of
$G'_2$ and $(v_2, w_2)$ is a leaf of $G'_1$. The graph obtained from $G'_2$
by leaf removal of $(v_1, w_1)$ and the graph obtained from $G'_1$ by leaf
removal of $(v_2, w_2)$ are the same, because they are both equal to $G''$,
the subgraph of $G$ induced by $V \setminus \{v_1, w_1, v_2, w_2\}$. Take a
history $\mathcal{H}''$ for $G''$. It can be completed to give two histories
for $G$, $\mathcal{H}'_1 = G, (v_1, w_1), G'_1, (v_2, w_2), \mathcal{H}''$
and $\mathcal{H}'_2 = G, (v_2, w_2), G'_2, (v_1, w_1), \mathcal{H}''$. The
induction hypothesis $\mathcal{P}_{N-2}$ applies to $G'_1$, so
$\mathcal{H}'_1$ and $\mathcal{H}_1$ have to end with the same core and the
same number of isolated points. The same is true for $\mathcal{H}'_2$ and
$\mathcal{H}_2$ because the induction hypothesis $\mathcal{P}_{N-2}$ applies
to $G'_2$, and also for $\mathcal{H}'_1$ and $\mathcal{H}'_2$ because the
induction hypothesis $\mathcal{P}_{N-4}$ applies to $G''$. By transitivity,
$\mathcal{H}_1$ and $\mathcal{H}_2$ end with the same core and the same
number of isolated points: $C_1 = C_2$ and $|I_1| = |I_2|$.

All possibilities have been examined, hence whatever the number of leaves of
$G$, $\mathcal{P}_N$ is true for $G$. This completes the induction step.

[13] A.K. Hartmann and M. Weigt, Statistical me-